Open
Conversation
5 tasks
|
@maciejkula any help here? we have a dependency in Recommenders on this repo |
5 tasks
|
Some benchmarks seem to indicate that this commit broke parallelism. Detailsimport modal
app = modal.App("lightfm-openmp-benchmark")
# Create image with LightFM compiled with OpenMP
image = (
modal.Image.debian_slim(python_version="3.10")
.apt_install("gcc", "g++", "libgomp1", "git")
## begin[broken_parallelism]
# .pip_install("numpy", "scipy", "requests", "scikit-learn", "cython")
# .run_commands("git clone https://github.com/daviddavo/lightfm.git /tmp/lightfm")
# .run_commands("cd /tmp/lightfm && python cythonize setup.py && pip install .")
## end[broken_parallelism
.pip_install("numpy", "scipy", "requests", "scikit-learn", "cython", "lightfm")
)
@app.function(image=image, cpu=4)
def benchmark():
import numpy as np
from scipy.sparse import coo_matrix
from lightfm import LightFM
import time
import os
import sys
print(f"Platform: {sys.platform}")
print(f"CPU count: {os.cpu_count()}")
# Check which module is loaded
try:
import lightfm._lightfm_fast_openmp
print("✓ Using OpenMP version")
except ImportError as e:
print(f"✗ OpenMP version not available: {e}")
try:
import lightfm._lightfm_fast_no_openmp
print("✓ Using no-OpenMP version")
except ImportError as e:
print(f"✗ No-OpenMP version not available: {e}")
# Create synthetic dataset
n_users = 30_000
n_items = 30_000
n_interactions = 300_000
users = np.random.randint(0, n_users, n_interactions)
items = np.random.randint(0, n_items, n_interactions)
data = np.ones(n_interactions)
interactions = coo_matrix((data, (users, items)), shape=(n_users, n_items))
# Benchmark
print("\nBenchmark results:")
baseline_time = None
for threads in [1, 2, 4]:
model = LightFM(no_components=128, loss="warp")
start = time.time()
model.fit(interactions, epochs=10, num_threads=threads)
elapsed = time.time() - start
if threads == 1:
baseline_time = elapsed
speedup = baseline_time / elapsed if baseline_time else 1.0
print(f"{threads} threads: {elapsed:.2f}s ({speedup:.2f}x speedup)")
return "Benchmark complete"
@app.local_entrypoint()
def main():
result = benchmark.remote()
print(result)I wish that I knew enough Cython to contribute a fix, but the best I can do right now is point that out ¯_(ツ)_/¯ |
Author
|
Me too, I just modified things until it worked, but I don't have much idea of Cython... If someone wants, feel free to fork my fork |
|
@daviddavo your fork absolutely works as others have said! We need to get this request merged! Anything we can do to get it done? |
Author
|
They should receive an email every time we comment here, but herhaps a direct e-mail would work better |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This makes it compatible with Python 3.12 and solves #709